A statistical approach for identifying primary substrates of ZSWIM8-mediated microRNA degradation in small-RNA sequencing data

Background One strategy for identifying targets of a regulatory factor is to perturb the factor and use high-throughput RNA sequencing to examine the consequences. However, distinguishing direct targets from secondary effects and experimental noise can be challenging when confounding signal is present in the background at varying levels. Results Here, we present a statistical modeling strategy to identify microRNAs that are primary substrates of target-directed miRNA degradation (TDMD) mediated by ZSWIM8. This method uses a bi-beta-uniform mixture (BBUM) model to separate primary from background signal components, leveraging the expectation that primary signal is restricted to upregulation and not downregulation upon loss of ZSWIM8. The BBUM model strategy retained the apparent sensitivity and specificity of the previous ad hoc approach but was more robust against outliers, achieved a more consistent stringency, and could be performed using a single cutoff of false discovery rate (FDR). Conclusions We developed the BBUM model, a robust statistical modeling strategy to account for background secondary signal in differential expression data. It performed well for identifying primary substrates of TDMD and should be useful for other applications in which the primary regulatory targets are only upregulated or only downregulated. The BBUM model, FDR-correction algorithm, and significance-testing methods are available as an R package at https://github.com/wyppeter/bbum. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-023-05306-z.


Introduction
Differential expression (DE) analyses seek to identify gene products that change in abundance after either altering a condition or perturbing a regulatory factor. Aiding in these analyses are statistical pipelines, such as DESeq2 [1], edgeR [2], and limma [3], which compare RNA-seq or microarray datasets to identify RNAs with significantly changed levels, after correcting for false discovery rate (FDR) due to multiple testing using the Benjamini-Hochberg procedure [4]. These pipelines have been invaluable for DE analyses of mRNAs as well as noncoding RNAs.
Noncoding RNAs often subject to DE analyses include the microRNAs (miRNAs), which are small RNAs that direct widespread post-transcriptional repression of metazoan mRNAs [5]. To perform this function, miRNAs associate with the effector protein Argonaute (AGO) to form a complex in which the miRNA specifies which targets are repressed, primarily through base pairing between the seed of the miRNA (miRNA nucleotides 2-7) and a site in the target mRNA [6]. Meanwhile, AGO causes repression, typically by recruiting the cytoplasmic mRNA deadenylation machinery [7].
Most miRNAs are quite stable, with half-lives much greater than those of typical mRNAs, presumably a consequence of their association with AGO, which protects miRNAs from cellular nucleases [8,9]. However, some miRNAs are more rapidly degraded, and in cells of both mammals and flies, this more rapid degradation appears to be the result of target-directed miRNA degradation (TDMD) [10]. TDMD is a phenomenon in which targets with unusual complementarity cause a conformational change recognized by the ZSWIM8 E3 ubiquitin ligase, which polyubiquitinates the AGO protein, leading to its degradation by the proteasome, thereby exposing the miRNA to degradation by cellular nucleases [10,11]. Through DE analysis of small-RNA sequencing (sRNA-seq) data acquired after reducing ZSWIM8 in different cell types, more than 40 candidate miRNA substrates of ZSWIM8 have been identified [10].
When identifying candidate substrates of ZSWIM8, standard DE analysis is not sufficient to distinguish between miRNAs that are significantly upregulated due to the primary effect of losing ZSWIM8-mediated TDMD, and those with significantly perturbed expression due to secondary effects, such as transcriptome changes caused by the dysregulation of miRNAs or other changes that might be caused by the loss of ZSWIM8. To exclude miRNAs changing due to secondary effects, Shi et al. [10] use the knowledge that ZSWIM8 mediates degradation of miRNA substrates, which implies that these substrates should undergo only upregulation upon the loss of ZSWIM8. Accordingly, the significance cutoffs (α values) of FDR-adjusted p values from DESeq2 (p adj ) are each adjusted down to the most permissive level that excludes all downregulated miRNAs. As a result, these ad hoc adjustments of α values vary widely, ranging from 0.05 to 10 -7 for different datasets analyzed (Fig. 1A) [10]. Although this approach seems better than classifying any significantly upregulated miRNA as a ZSWIM8 substrate, it has several shortcomings: (1) it is unduly sensitive to outliers among downregulated miRNAs, which can reduce sensitivity; (2) FDRs are inconsistent among experiments and cannot be predetermined; and (3) the FDR and specificity of each analysis depend on the sample size.
Here, we developed a statistical modeling strategy that accounts for varying secondary effects in RNA-seq datasets, thereby enabling primary substrates to be identified at a consistent predetermined statistical stringency. Compared to the previous approach, this strategy achieved more robust results with fewer shortcomings. This strategy should also provide an improved strategy for identifying direct targets of other types of regulatory pathways.  1 Statistical modeling of secondary effects in DE datasets. A Representative plots of fold changes in miRNA levels observed upon ZSWIM8 knockout, as measured by small-RNA sequencing. Points for miRNAs with significant upregulation based on the cutoffs defined by Shi et al. [10] are colored in red. Points for miRNAs that did not meet the adjusted cutoffs but would be significant based on a common cutoff of 0.05 are colored in blue. The number of miRNAs and passenger strands quantified in each dataset is indicated. B Histograms of raw p values for upregulated and downregulated miRNAs and passenger strands analyzed in datasets of A. The peak near p = 0, which indicates true signal above null, is indicated by an orange caret for downregulated miRNAs in each dataset.

Significant signal among downregulated miRNAs implies secondary effects
In published datasets, a standard p adj cutoff at α = 0.05 was suitable for identifying primary substrates in some contexts, such as HeLa cells [10]. However, in other contexts, such as contact-inhibited mouse embryonic fibroblasts (MEFs) or induced mouse neurons (iMNs), the same cutoff would have classified many miRNAs with changes that seemed no different than background as primary substrates (Fig. 1A). These differences observed between datasets, which motivated the use of varying stringency of α values, were presumably due to varying levels of secondary effects. To evaluate this idea, we examined the downregulated miRNAs, reasoning that because loss of ZSWIM8 should only cause upregulation of primary miRNA substrates, any signal among downregulated miRNAs in excess of that expected by chance implied the existence of true secondary effects. To search for this signal, we examined p-value distributions. A set of data points drawn from a null distribution is expected to have a uniform distribution of p values, ranging from 0 to 1, as the cumulative fraction of points called as false positives should equal to α for all values of α. Thus, a significant signal should manifest as a peak of enriched density near p = 0 [12]. For each of the three datasets of Fig. 1A, the distribution of raw p values from DESeq2 was examined, looking separately at the results for upregulated and downregulated miRNAs. As expected for datasets that included ZSWIM8 substrates, distributions for upregulated miRNAs peaked near p = 0 (Fig. 1B). In addition, for the two datasets that required a more stringent α value, the distributions for downregulated miRNAs also peaked near p = 0, albeit at a level lower than that observed for upregulated miRNAs (Fig. 1A, B). These results supported the idea that some miRNAs were truly downregulated upon the loss of ZSWIM8, likely as a result of secondary effects, and the idea that contexts with stronger secondary effects required stronger adjustments of stringency.

Statistical modeling of p values enables the separation of primary and secondary effects
If secondary effects acted symmetrically, causing miRNAs to increase as well as decrease (at similar frequency and similar magnitudes), then the excess in the peak near p = 0 observed for upregulated miRNAs compared to that observed for downregulated miRNAs should correspond to the density of true primary substrates of ZSWIM8 (Fig. 1C). Accordingly, we developed a statistical strategy to separate the components of the p-value distribution to better classify the primary substrates and the secondary effects. Our strategy made three assumptions: (1) primary effects were stronger than secondary effects, (2) secondary effects were approximately symmetrical between upregulated and downregulated data points, and (3) primary effects caused upregulation and never downregulation. Previous studies have described the successful use of a beta-uniform mixture (BUM) distribution model and its variants to model p-value distributions between 0 and 1 [13][14][15]. In these studies, the uniform component represents the distribution of null data points, while the beta component represents the characteristic peak near p = 0. Building upon these concepts, we implemented a modified mixture distribution model, which we call the bi-beta-uniform mixture (BBUM) model, to describe our p values. The BBUM distribution contains two beta components [Beta(ar, 1) and Beta(a, 1)], instead of one, which respectively correspond to the primary and secondary effects, followed by a similar uniform component [U(0, 1)] for the null distribution (Fig. 1C).
The p values were fit to this mixture model, with the downregulated data points fit to a model that lacked the primary-effect component (Fig. 1C).
The model faithfully captured the distributions of p values from both halves of each dataset, especially for the p-value density near 0 (Fig. 2). As expected, datasets that required more stringent α values for p adj (Fig. 1A) were modeled with more pronounced beta components for secondary effects, as indicated by the greater deviation between the null distribution and the model fit for downregulated points (Fig. 2, orange shading). These results indicated that for these datasets the model was able to separate primary and secondary effects.

Significance testing after BBUM adjustment predicts direct substrates of TDMD
Because the model was able to represent the distribution density that corresponded specifically to the primary effects, the expected FDR could be calculated at any desired cutoff among the p values considered, which we defined as the BBUM-FDRadjusted p value (p BBUM ). Choosing a p BBUM cutoff of 0.05, we reanalyzed the datasets from mammalian and fly cell lines from Shi et al. [10] to identify ZSWIM8 primary substrates. Across all nine datasets examined, the proposed primary substrates identified using the BBUM strategy largely matched those identified previously, while imposing a consistent, predetermined FDR cutoff (Fig. 3A). Out of the 75 instances classified as significant at this cutoff, four were newly classified as significant. Three of these four involved miRNAs that were either also proposed to be ZSWIM8 substrates in other contexts or related to another proposed ZSWIM8 substrate, which supported the idea that these four miRNAs included true ZSWIM8 substrates (Fig. 3A, Additional file 1: Table S1). This idea was further reinforced by analysis of the passenger strands of these candidate miRNAs. During miRNA biogenesis, the pre-miRNA hairpin is processed into a miRNA duplex containing the miRNA paired to its passenger strand. When this duplex associates with AGO, the miRNA strand is retained, Deviation between the distribution for downregulated miRNAs and the null distribution is shaded in light orange; a greater deviation at the lower tail indicates a greater excess in density near p = 0, which corresponds to a more substantial contribution of the beta component for secondary effects in the BBUM model and the passenger strand is ejected and rapidly degraded. Because TDMD acts upon mature AGO-miRNA complexes, the miRNA strand and not the passenger strand should be affected by the loss of ZSWIM8 [10]. Indeed, each of the four newly significant miRNAs were upregulated upon the loss of ZSWIM8 without significant change in their passenger strands, as observed for other miRNAs predicted to be ZSWIM8 substrates (Fig. 3B).
Three other edge cases proposed to be ZSWIM8 substrates by the previous method were not identified at p BBUM < 0.05 when using the BBUM model (Fig. 3A, Additional file 1: Table S1). One of these was miR-7-5p in clonal ZSWIM8 knockout cells. This known TDMD substrate [16] was not sufficiently upregulated in knockout K562 cells to reach significance over the relatively high background variability of this dataset (Fig. 3A). Nonetheless, statistical significance was readily achieved for miR-7-5p in K562 cells when using datasets in which ZSWIM8 was knocked down using CRISPRi, which led to much lower background variability than observed with clonal knockout cells (Fig. 3A). Likewise, in a dataset analyzing a different ZSWIM8-knockout line (Han et al., 2020), miR-7-5p upregulation met the significance threshold using the BBUM approach (Fig. 3C).
Thus, on the whole, using the BBUM model, candidate primary substrates of ZSWIM8 were identified while implementing a consistent and predetermined FDR confidence value across all cellular contexts examined, without noticeably sacrificing the apparent sensitivity or specificity of the previous approach. We attribute this success to the ability of the BBUM model to adjust the varying spread of background p values to a consistent range (Fig. 3D).

BBUM correction applies a consistent statistical stringency
To benchmark the performance of our approach, we randomly generated simulated datasets of p values containing varying levels of primary and background secondary signal under the BBUM distribution. We compared the empirical FDR of our BBUM strategy, using p BBUM < 0.05 as the significance cutoff, against that of the method used previously by Shi et al. [10]. The BBUM approach had a mean FDR of 0.050 ± 0.002 (95% confidence interval (CI)) ( Fig. 4A). The Shi et al. method produced a comparable mean FDR Fig. 3 Identification of candidate ZSWIM8-sensitive miRNAs using the BBUM method. A Plots of log 2 fold changes in miRNA levels observed upon loss of ZSWIM8 in all mammalian and fly cell lines examined in Shi et al. [10]. Two datasets were derived from CRISPRi knockdown (KD) of ZSWIM8 in K562 cells, using one of two different guide RNAs (A or B). All other datasets were derived from knockout (KO) of either ZSWIM8 or Dora, the Drosophila ZSWIM8 ortholog. Points for miRNAs that met the common p BBUM significance cutoff of 0.05 are in red, with n indicating the number passing this cutoff, shown as fraction of the total number of miRNAs and passenger strands quantified. Points for passenger strands of these miRNAs are in cyan, if the passenger strands were both annotated and observed above the expression threshold. Points for miRNAs with classifications differing from that of previous work [10] are outlined in black (n = 7). Points for miR-7-5p, a known TDMD substrate [16], are labeled. B Different effects of the loss of ZSWIM8 on miRNAs and their passenger strands. Fold changes in miRNA levels are shown for two datasets with newly significant miRNAs. Only miRNAs found significant by the BBUM method and whose passenger strands were quantified in the dataset are shown, with each miRNA paired with its passenger strand(s). Points for newly significant miRNAs are outlined and labeled. C Plot of fold changes in miRNA levels observed upon knockout of ZSWIM8 in K562 cells by Han et al. [11]. Points for two downregulated outliers of 0.036 ± 0.002 but was less constant, as measured by the coefficient of variation (CV) of the FDR, which was 2.47 ± 0.11 for the previous method, compared to 1.55 ± 0.08 for the BBUM method. Thus, BBUM correction produced an accurate and consistent FDR when evaluated using simulated datasets.
BUM models are reported to be sensitive to outliers with extremely small p values due to the asymptotic behavior at zero of the likelihood function of the type of beta distribution used [14]. Indeed, we found that adding artificial extreme outliers to downregulated miRNAs in either empirical or simulated datasets could cause the BBUM procedure to overcorrect for secondary effects. The previous approach by Shi et al. [10] was even more prone to overcorrection, with a single extreme outlier able to cause all of the upregulated miRNAs to be designated as background. The influence of outliers was also illustrated in the Han et al. [11] dataset for K562 cells. Two extreme outliers within this dataset prevented any miRNAs to be designated as    Fig. 4 Benchmarking of correction procedures and significance-testing methods. A Histograms of the empirical FDR for 6000 simulated p value datasets with no outliers. Results are plotted for the extreme-value method used by Shi et al., the original BBUM procedure without outlier trimming at p BBUM < 0.05, and the modified BBUM procedure with automatic outlier trimming at p BBUM < 0.05. The frequency of the density near 0, which exceeded the y-axis scale, is indicated alongside its fraction among all trials for each histogram. The FDR cutoff of the BBUM methods at 0.05 is indicated on the x-axis in orange. The mean FDR of all simulation trials is indicated on the right with corresponding 95% CIs by bootstrapping. The CV of FDR is similarly indicated. B As in A, but for 6000 simulations with at least one programmed outlier. The mean FDR and mean sensitivities are indicated with corresponding 95% CIs by bootstrapping. primary ZSWIM8 substrates when using the approach of Shi et al. [10] and severely weakened the significance of miR-7-5p when using the BBUM model (Fig. 3C).
To mitigate the effects of outliers, we developed a conservative outlier detection method that used the fitted r parameter of the BBUM model to identify and trim probable outliers from downregulated data points. The performance of the modified BBUM procedure was similarly benchmarked using simulated p-value datasets, with and without added outliers. In datasets without added outliers, the modified BBUM procedure did not have significantly increased mean FDR (0.051 ± 0.002) or CV of FDR (1.60 ± 0.09) when compared to the original BBUM procedure (Fig. 4B). In datasets with added outliers, the modified BBUM procedure was somewhat improved over the unmodified procedure, with mean FDR increasing from 0.016 ± 0.001 for the original BBUM procedure to 0.021 ± 0.002 for the modified procedure and mean sensitivity increasing from 0.39 ± 0.01 to 0.44 ± 0.01 (Fig. 4B). Importantly, both the original and the modified BBUM procedures were less sensitive to outliers than the previous method of Shi et al. [10], which had a mean FDR of 0.006 ± 0.001 and mean sensitivity of 0.25 ± 0.01 in the presence of one or more outliers (Fig. 4B). Moreover, the modified BBUM procedure successfully identified and trimmed the two outliers in the K562 dataset (Fig. 3C), as well as any artificial outliers we added to other empirical datasets.
The ad hoc method by Shi et al. [10] provides a fixed stringency for each dataset. In contrast, the BBUM method allows the stringency to be quantitatively tuned by choosing the significance cutoff for p BBUM to suit the needs of the experiment or hypothesis at hand. Therefore, we extended our benchmarking analysis to a range of possible significance cutoffs for p BBUM , and assessed the accuracy of the BBUM method at each cutoff α BBUM in our simulations. When no outliers were present, both the original and the modified BBUM methods faithfully achieved results at the predetermined FDR at all α BBUM values tested (Fig. 4C). When one or more outliers were present, the modified BBUM method partially mitigated the overcorrection of secondary effects by the original BBUM method at all α BBUM values tested, especially when α BBUM was small, which was the condition in which overcorrection was most severe in the original BBUM method (Fig. 4C). In fact, the overcorrection of the modified BBUM procedure, tested across a wide range of α BBUM values, was no worse than that seen at α BBUM = 0.05, where the mean empirical FDR was 0.021 ± 0.002 (Fig. 4B, C). In no cases did this conservative outlier detection method adjust the FDR to above the intended stringency (Fig. 4C). Hence, the BBUM procedure performed robustly in both ideal and non-ideal scenarios and performed significance testing with improved consistency and flexibility, as well as lower sensitivity to outliers.

BBUM correction can be applied to other RNA-seq datasets
Encouraged by the success of our approach for analyzing sRNA-seq datasets examining the effects of ZSWIM8 knockout/knockdown on miRNA levels, we applied the approach to another type of experimental dataset. MicroRNAs invariably reduce the levels of their primary regulatory targets [5,17]. Thus, RNA-seq datasets examining the changes in mRNAs observed upon introducing a miRNA seemed appropriate for BBUM analysis, in that the primary effects were expected to be in one direction, whereas the secondary effects and background variability were expected to be symmetrically distributed between both up-and downregulation. To test whether BBUM might help identify mRNAs most likely to be directly targeted, we analyzed 29 RNA-seq datasets in which a miRNA was transfected into either HeLa or HEK293FT cell lines [18]. As observed for the sRNA-seq results, a standard DESeq2 analysis using α = 0.05 as the cutoff for p adj classified many RNAs as differentially expressed following miRNA transfection-some downregulated but others upregulated (Fig. 5A). For most datasets, BBUM FDR correction robustly identified mRNAs that were the most significantly downregulated as top candidates for the primary targets of these miRNAs (Fig. 5A, B). In two out of the 29 datasets (miR-139 and miR-144 in HeLa cells), the BBUM approach failed to identify any direct-target candidates because no data points passed our false-discovery cutoff of 0.05 for p BBUM (Fig. 5; Additional file 1: Fig. S1). Across the remaining 27 datasets, the proportion of primary and secondary effects appeared to span a wide range, and the BBUM analysis helped to characterize these differences. For some datasets (e.g. miR-122 in HEK293FT cells), primary effects appeared to far exceed secondary effects. Indeed, for 11 datasets, no secondary effects were detectable, despite a clear signal for primary target repression (with 19-91 genes classified as significantly repressed). In contrast, three datasets had hundreds of genes called as significantly influenced through secondary effects, far exceeding the number of genes under significant primary effects ( Fig. 5; Additional file 1: Fig. S1). Despite the broadly varying levels of secondary effects, the BBUM approach consistently adjusted p values to the same level of stringency across the 29 datasets tested (Fig. 5B, C). Thus, the approach can be applied to differential expression datasets with widely varying levels of secondary effects.

Discussion
Our BBUM method will help to more rigorously identify miRNAs subject to TDMD and shows promise for detecting the relative contributions of primary and secondary effects when analyzing miRNA-mediated regulation. We suspect it will also be useful in other analyses in which the primary effect of a regulatory phenomenon causes changes in one direction, whereas secondary effects and background variability cause changes in either direction. Other potential uses include mRNA analyses identifying the targets of transcriptional inhibitors or proteomic analyses identifying the targets of ubiquitin ligases or other degradation phenomena.
We envision some cases in which the use of our BBUM approach would be limited: (1) If there is an overwhelming degree of secondary effects, the signal-to-noise ratio might not be sufficient for the primary and secondary signal to be separated, leading to low sensitivity. (2) If the data are not normalized to spike-ins, and a substantial number of genes significantly change in one direction, the assumption that the background data points are symmetrical about a fold-change of zero might not be fully satisfied. For example, a mild bias towards downregulation was observed for our dataset examining the effect of ZSWIM8/Dora in S2 cells, in which more than 10% of miRNAs were significantly upregulated (Fig. 3A). Nonetheless, our method was empirically robust against this effect.
Although this work applied our method to datasets downstream of DESeq2, other common differential expression analysis pipelines, such as edgeR and limma, should also be compatible. As the BBUM correction is agnostic to the source of p values, we expect that any p-value datasets with similar behaviors and constraints should also be  Fig. S1). Points for mRNAs that met both the significance cutoff by the BBUM method and the default Benjamini-Hochberg method (p adj ) are colored red. Those that met the cutoff for the BBUM method but not p adj are colored brown. Those that met the cutoff for p adj but not the BBUM method are colored blue; these points roughly correspond to genes subject to significant secondary effects. Outliers identified by the BBUM model are colored in magenta. Number of points belonging to each category and the total number of points plotted are indicated for each plot. B Volcano plots of raw p values as a function of mRNA log 2 fold change for the same datasets analyzed in A. Colors and order are as in A. Red data points falling beyond the plotted range are omitted. C Proportions of mRNAs in each category for each of the 29 datasets analyzed (Additional file 1: Fig. S1). Colors are as in A and B. Datasets are arranged in descending order of the number of data points called as significant after BBUM FDR adjustment. Bars corresponding to the four representative datasets shown in A and B are indicated equally applicable, as was the case for our simulated p values. Combining BBUM correction with current statistical procedures can allow the significance testing of more specific hypotheses and a more standardized significance cutoff across different datasets and experiments.

Analysis of sRNA-seq results
All sRNA-seq datasets analyzed contained three replicates each for the wildtype and the knockdown/knockout conditions, with two exceptions: datasets for wildtype MCF7 cells and ZSWIM8-knockout iMNs had only two replicates. Read counts for the Shi et al. datasets were downloaded directly from published data, while the K562 dataset by Han et al. was reanalyzed using an identical pipeline. Sequencing reads were matched to the first 19 nt of mature miRNA sequences downloaded from TargetScan7 [19] to generate read counts for each annotated miRNA and its passenger strand. We note that the miRNA sequences of TargetScan include a set of "other miRBase annotations, " which includes most of the miRNA passenger strands as well as hundreds of false-positive annotations [20,21]. Shi et al. [10] had included this set of other miRBase annotations because it contained most of the miRNA passenger strands, which serve as useful internal standards for secondary effects. We opted to use the same list of miRNA annotations as Shi et al. so that we could better compare our results to those of Shi et al. Including these false-positive annotations was not expected to affect our analysis for two reasons: (1) most of the false-positive annotations had too few reads to pass our expression threshold for downstream analysis, and (2) the false-positives that passed our expression threshold were not expected to be sensitive to ZSWIM8 loss. Indeed, examination of all the ZSWIM8-sensitive miRNAs identified by either our new procedure or the previous procedure of Shi et al. confirmed that they were each bona-fide miRNAs, as annotated in MirGeneDB [20]. After mapping to miRNA annotations, changes and p values were calculated using DESeq2 (v1.32.0) [1], using the default Wald test method, without the lfcShrink() function, and with independentFiltering = FALSE in results(). All subsequent analyses were filtered to consider only miRNAs that had at least five matched reads in each replicate of each treatment of each dataset. To conduct BBUM FDR adjustment, the modified BBUM method was used with all default settings in the BBUM_DEcorr function of the bbum R package from this work. Upregulated miRNAs were used as the signal set and downregulated miRNAs were used as the background set. Passenger strands were as annotated in TargetScan7. If no passenger strand was annotated for a miRNA, or if the annotated passenger strand did not meet the read-count cutoff, its fold-change was not plotted.

Significance testing by the method of Shi et al. [10]
To call changes as significant by the method of Shi et al. [10], the default FDR-adjusted p values calculated by DESeq2 using the Benjamini-Hochberg procedure (p adj ) were used. The p adj threshold for significance was chosen as the most permissive value out of a defined sequence of canonical critical values (0.05, 0.01, 0.001, 0.0001,…) that excluded all downregulated miRNAs in the dataset. For example, in iMNs, the strongest downregulated miRNA had a p adj value of 0.000262, and thus the threshold for that dataset was adjusted to 0.0001.

The BBUM statistical model of p values
The p values from the DE experiments can be reasonably modeled as a random variable X following the BBUM distribution: the probability density function (PDF) of the model is defined as: likewise, the cumulative distribution function (CDF) is defined as: for 0 < p ≤ 1 ,0 < < 1, 0 < a < 1, 0 < θ < 1, and 0 < r < 1 . λ represents the fraction of null distribution density over all density except the primary signal, θ represents the fraction of primary signal distribution density over all density, and a describes the shape of the secondary signal peak. r describes the ratio between the shape parameters of the primary and secondary signal components, such that the shape of the primary signal peak is described by ar. The PDF asymptotes to infinity as p approaches 0, and monotonically decreases as p goes from 0 to 1.

BBUM model fitting and parameter estimation
Given a set of p values p, the BBUM distribution model was fit to p using a modified maximum likelihood estimation (MLE) method, which varies the parameter values until the calculated (log-)likelihood function value is at its maximum. While varying a shared set of parameters, p values associated with upregulated miRNAs, p + , were fit to the full BBUM function, whereas p values associated with downregulated miRNAs, p − , were fit to the same BBUM function with θ fixed at 0 and r fixed at 1, which corresponded to a zero component for primary effects (Fig. 1C). The total log-likelihood function value ℓ was used as the maximization target for MLE fitting and was defined as the sum of loglikelihood values of the two halves: p values of 0 may appear in datasets as a result of underflow, due to the computational approximation of very small values. Due to the asymptotic behavior of the likelihood function when any p values in p equals to 0, any p values smaller than 10 × the machine limit in R (.Machine$double.xmin, which was 2.23 × 10 −308 in the implementation of this work) were adjusted to 10 × the machine limit to avoid this issue.
Fitting was performed using the optim() function in R using the default Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm for a maximum of 200 iterations. Unless otherwise stated, parameters λ, a, and r were bounded to (0, 1). θ was bounded to (0, 1-2α BBUM ), where α BBUM was the critical threshold of BBUM-adjusted p values used for significance testing, to prevent the local minimum near θ = 1 where all upregulated points would be regarded as primary signal when there is very low or no primary signal. The four parameters were bounded by transforming the values using the logit (logodds) function. Given a parameter x with lower bound x lb and upper bound x ub , the logit function transforms a value as: For each dataset, fitting was initiated at each of six sets of fixed initial parameter values (Additional file 1: Table S2). The initial values for θ were linearly rescaled to its bounds. Out of the six attempts, the successfully converged solution with the highest total loglikelihood function value was chosen as the final solution.

Outlier detection and trimming
Due to the asymptotic behavior of the likelihood function at zero, BBUM model fitting can be prone to overcorrection for secondary effects when outliers with extremely low p values among downregulated miRNAs are present. To mitigate this, we developed and implemented a conservative method for outlier detection and trimming. The data were first preliminarily fit to the model with a wider bound for r at (0, 10). When very strong outliers were present among downregulated miRNAs, the algorithm would converge to a value of r > 1, which implied a stronger secondary effect than primary effect, violating the assumption of our model and suggesting the existence of unexpectedly strong signal in the background. An increasing number of downregulated miRNAs with the strongest p values were then iteratively trimmed as outliers until the algorithm converged to a value of r < 1, unless the condition was not met after trimming 5% of or 10 downregulated miRNAs, whichever was lower. BBUM FDR correction of p values using this outlier trimming method is specifically referred to as the modified BBUM method in this work, and is the default in the BBUM_corr and BBUM_DEcorr functions in the bbum R package from this work.

Significance testing
The expected FDR level of falsely calling either null or secondary signal data points as the primary signal can be calculated at any given cutoff for raw p values of upregulated data points. We employed a strategy for adjusting p values that resembled the one that DESeq2 adopts for the Benjamini-Hochberg procedure [1]. For every raw p value of an upregulated miRNA, we calculated the expected FDR value at that value using the BBUM model and denoted it as the BBUM-FDR-adjusted p value (p BBUM ) for significance testing. Thus, to control for FDR at a pre-determined cutoff, such as α BBUM = 0.05, all points with p BBUM < 0.05 would be called as significant. The expected FDR and the value of p BBUM were thus calculated as

Simulation of p values and benchmarking
To benchmark strategies for significance calling and correction, p values with primary and secondary effects were simulated using the BBUM model. For each simulation, the total number of p values, n, was generated from a uniform distribution, and the number of upregulated points was determined through a binomial distribution: The values of p were then modeled as a random variable X, which followed a compound distribution of the upregulated and downregulated halves, X p+ and X p− , respectively: For each half of the dataset, p values were simulated under respective BBUM models. If programmed outliers were simulated among downregulated points, the outliers were simulated as a separate beta component similarly to the primary signal, where The proportions of the mixture components were drawn from binomial distributions using the values of λ, θ, and θʹ, with θ' representing the θ parameter for outliers, where Each component of the BBUM model was modeled as previously described, where For each simulation, the BBUM parameters were randomly generated from uniform or exponential distributions over reasonable expected ranges of values: a, θ , r) .
Only simulations with at least three data points under primary effects were accepted, to allow sufficient true hit data points for benchmarking. For simulations with outliers, only simulations with at least one outlier were accepted.
True and false positives and negatives were identified by comparing the significance calling of every point to the BBUM distribution component that the point belonged to. Benchmarking statistics, such as FDR and sensitivity, were then calculated using the following equations (Fig. 1C): For each simulation, 6000 simulated datasets were generated. Confidence intervals for mean and CV statistics were generated by ordinary bootstrapping using the boot package in R. Datasets were resampled 3000 times, and nonparametric 95% confidence intervals were defined by the empirical bootstrap confidence intervals, using the "basic" method of the boot.ci function. Based on the central limit theorem, confidence intervals were presented as the mean deviation of the lower and upper intervals from the sample mean.

Artificial outliers for datasets
To assess the potential impact of outliers on different adjustment methods using empirical datasets from Shi et al. [10], we added to each dataset one to five extreme downregulated outliers with raw p values at 10 −300 .

Analysis of RNA-seq datasets
Read counts per protein-coding transcript were directly obtained from published RNA-seq datasets [18]. Briefly, these RNA-seq samples had been prepared from HeLa or HEK293FT cell lines transfected with respective miRNA duplexes using RNAiMAX (ThermoFisher, 13778150), and the libraries were prepared using NEXTFLEX Rapid Directional RNA-Seq Kit with poly(A)-selection beads (PelkinElmer, NOVA-5138-07). Reads were aligned to the human genome (reference assembly hg19) using STAR v2.2, and read counts were calculated using htseq-count. Twelve miRNAs were transfected in duplicate in HEK293FT cells, and 17 miRNAs were transfected in duplicate in HeLa cells.
To calculate differential expression of each mRNA upon miRNA transfection, the mRNA read counts observed following miRNA transfection were compared with those observed after the transfection of each of the other miRNAs in that cell line. Using these other datasets as the reference accounted for the global changes caused by miRNA transfection, including the effects of competition with endogenous miRNAs for loading into AGO ; Sensitivity = TP FN + TP proteins [18]. Mean fold changes and p values were calculated using DESeq2 (v1.32.0), using the default Wald test method, without the lfcShrink() function, and with independ-entFiltering = FALSE in results(). Benjamini-Hochberg-adjusted p values (p adj ) were calculated as default by DESeq2. Genes were filtered for having at least 5 read counts across all samples from the corresponding cell line. BBUM FDR-adjusted p values (p BBUM ) were calculated using the modified BBUM approach with all default settings in the BBUM_DEcorr function, and using downregulated mRNAs as the signal set and upregulated mRNAs as the background set.

TDMD
Target-mediated microRNA degradation BBUM Bi-beta uniform mixture FDR False discovery rate DE Differential expression CI Confidence interval
Additional file 1: Table S1. miRNAs in the Shi et al. datasets identified as ZSWIM8-sensitive. Table S2. Raw initial parameter values for BBUM model fitting. Figure S1. Application of the BBUM correction method to RNA-seq datasets measuring the effects of transfecting a miRNA into either HeLa or HEK293FT cells. A Plots of log 2 fold changes in mRNA levels observed upon transfection of the indicated miRNA duplexes, showing all 29 datasets analyzed. Colors are as in Figure 5. B Volcano plots of raw p values as a function of mRNA log 2 fold change, showing all 29 datasets analyzed. Colors are as in Figure 5